Slight-pause marks boundary identification based on conditional random field
MO Yiwen, JI Donghong, HUANG Jiangping
Journal of Computer Applications    2015, 35 (10): 2838-2842.   DOI: 10.11772/j.issn.1001-9081.2015.10.2838
Abstract
Identifying the boundaries of punctuation marks is an important research topic in natural language processing and underlies applications such as word segmentation and phrase chunking. To identify the boundaries of Chinese slight-pause marks, which separate coordinate words and phrases, the Conditional Random Field (CRF), a model used for sequence segmentation and labeling, was adopted. First, the slight-pause mark boundary recognition task was described as two types; then the corpus tagging method and process, together with feature selection, were studied. A series of experiments on slight-pause marks was carried out using corpus recommendation and ten-fold cross validation. The experimental results show that the proposed method, with the selected boundary identification features, is effective for slight-pause mark boundary identification: the F-measure of boundary identification increased by 10.57% over the baseline, and the F-measure of words divided by slight-pause marks reached 85.24%.
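As a rough illustration of how boundary identification can be cast as sequence labeling for a CRF, the sketch below tags a tokenized sentence with B/I/O labels and builds per-token window features. The abstract does not specify the label scheme, the features, or any toolkit, so everything here is an assumption made for illustration: the B/I/O scheme, the function names `bio_labels` and `token_features`, the feature names, and the example sentence.

```python
PAUSE = "、"  # the Chinese slight-pause mark

def bio_labels(tokens, spans):
    """Label tokens with B/I/O given coordinate spans.

    spans: list of (start, end) token indices, end exclusive.
    The B/I/O scheme is an assumption, not the paper's tag set.
    """
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels

def token_features(tokens, i, window=2):
    """Build a hypothetical feature dict for token i (window features)."""
    feats = {"bias": 1.0, "w0": tokens[i], "is_pause": tokens[i] == PAUSE}
    # Neighbouring tokens within the window, padded at sentence edges.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"w{off:+d}"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    # Distance in tokens to the nearest slight-pause mark (-1 if none).
    pauses = [k for k, t in enumerate(tokens) if t == PAUSE]
    feats["dist_to_pause"] = min((abs(i - k) for k in pauses), default=-1)
    return feats

def sentence_features(tokens):
    """Feature dicts for every token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]

# Illustrative sentence: "I like apples, bananas and oranges."
tokens = ["我", "喜欢", "苹果", "、", "香蕉", "和", "橘子"]
# Assumed gold span: the whole coordinate structure, tokens 2..6.
labels = bio_labels(tokens, [(2, 7)])
features = sentence_features(tokens)
```

Pairs of `features` and `labels` in this shape are the kind of input a CRF trainer typically consumes; the features actually selected in the paper may differ.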